Why?
The Web has lots of stuff
- frontier beyond curated datasets
- stuff is wrapped in HTML
- HTML is transported over HTTP but composed for human-to-machine (h2m) consumption
- Intellectual Property rights bear serious consideration
API
Application Programming Interface
- Built for machine-to-machine interactions
- Instructions for programs
Client / Server
- Make [R] interface with the web
- Same as h2m, but now machine-to-machine (m2m)
JSON
- JavaScript Object Notation is a language-independent data format
- Currently one of the most common data formats for asynchronous client/server communication
- Consists of key-value pairs (objects) and ordered lists of values (arrays)
# from https://en.wikipedia.org/wiki/JSON
{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    },
    {
      "type": "mobile",
      "number": "123 456-7890"
    }
  ],
  "children": [],
  "spouse": null
}
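To see how these JSON structures map onto R, a record like the one above can be parsed from a string instead of from a server. The following is a minimal sketch using `fromJSON()` from jsonlite on a shortened copy of the example; the names `person_json` and `person` are only for illustration.

```r
library(jsonlite)

# A shortened copy of the Wikipedia example, held in an R string
person_json <- '{
  "firstName": "John",
  "isAlive": true,
  "age": 25,
  "address": { "city": "New York", "state": "NY" },
  "phoneNumbers": [
    { "type": "home",   "number": "212 555-1234" },
    { "type": "office", "number": "646 555-4567" }
  ],
  "children": [],
  "spouse": null
}'

person <- fromJSON(person_json)

# A JSON object becomes a named list; nested objects become nested lists
person$firstName       # "John"
person$address$city    # "New York"

# JSON true and numbers become R logical and numeric values
person$isAlive         # TRUE
person$age             # 25

# An array of objects that share keys is simplified to a data frame
class(person$phoneNumbers)    # "data.frame"
person$phoneNumbers$number    # "212 555-1234" "646 555-4567"
```

The same `$` notation works on results fetched from a URL, since `fromJSON()` accepts either a JSON string or an address.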
Example
Demonstration
library(jsonlite)
# https://cran.r-project.org/web/packages/jsonlite/vignettes/json-aaquickstart.html
# for building tibbles
library(tidyverse)
Single JSON object
When the server response is a single JSON object, jsonlite converts it to a named list, which makes viewing the data pretty simple.
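# Note: OMDb has since begun requiring an API key (an extra apikey= query
# parameter); without one, this request is likely to fail.
oneJSONresult <- fromJSON("http://www.omdbapi.com/?t=rocky&y=&plot=full&r=json")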
# oneJSONresult
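The address passed to `fromJSON()` is a base URL followed by a query string of `name=value` pairs joined with `&`. As far as the OMDb parameters go, `t` appears to be the title, `y` the year, `plot` the plot length, and `r` the response format. A rough sketch of assembling such a URL in base R follows; `build_query_url()` is a hypothetical helper written for this example, not part of jsonlite or any package.

```r
# Hypothetical helper: assemble a request URL from a base address and
# a named list of query parameters
build_query_url <- function(base_url, params) {
  # Percent-encode each value so spaces and symbols are safe in a URL
  values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  paste0(base_url, "?", paste0(names(params), "=", values, collapse = "&"))
}

# Rebuild the request used above from its individual parameters
omdb_url <- build_query_url(
  "http://www.omdbapi.com/",
  list(t = "rocky", y = "", plot = "full", r = "json")
)
omdb_url

# A title containing a space is encoded rather than breaking the URL
build_query_url("http://www.omdbapi.com/", list(t = "rocky balboa", r = "json"))
```

Keeping the parameters in a list makes it easier to change one value (a different title, say) without retyping the whole address.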
Let’s see the results in the next slide
oneJSONresult
$Title
[1] "Rocky"
$Year
[1] "1976"
$Rated
[1] "PG"
$Released
[1] "03 Dec 1976"
$Runtime
[1] "120 min"
$Genre
[1] "Drama, Sport"
$Director
[1] "John G. Avildsen"
$Writer
[1] "Sylvester Stallone"
$Actors
[1] "Sylvester Stallone, Talia Shire, Burt Young, Carl Weathers"
$Plot
[1] "Rocky Balboa is a struggling boxer trying to make the big time, working as a debt collector for a pittance. When heavyweight champion Apollo Creed visits Philadelphia, his managers want to set up an exhibition match between Creed and a struggling boxer, touting the fight as a chance for a \"nobody\" to become a \"somebody\". The match is supposed to be easily won by Creed, but someone forgot to tell Rocky, who sees this as his only shot at the big time."
$Language
[1] "English"
$Country
[1] "USA"
$Awards
[1] "Won 3 Oscars. Another 16 wins & 21 nominations."
$Poster
[1] "https://images-na.ssl-images-amazon.com/images/M/MV5BMTY5MDMzODUyOF5BMl5BanBnXkFtZTcwMTQ3NTMyNA@@._V1_SX300.jpg"
$Metascore
[1] "N/A"
$imdbRating
[1] "8.1"
$imdbVotes
[1] "387,927"
$imdbID
[1] "tt0075148"
$Type
[1] "movie"
$Response
[1] "True"
The resulting list behaves as you would expect in R.
- You can list all the element names.
names(oneJSONresult)
[1] "Title" "Year" "Rated" "Released" "Runtime" "Genre" "Director" "Writer" "Actors"
[10] "Plot" "Language" "Country" "Awards" "Poster" "Metascore" "imdbRating" "imdbVotes" "imdbID"
[19] "Type" "Response"
- Extract an individual element by name
oneJSONresult$Title
[1] "Rocky"
oneJSONresult$Awards
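Note that every value in the result above is a character string, including the year, rating, and vote count. A small sketch of converting a few of them before analysis is shown here; the list `movie` simply copies values from the output above so that the example runs without a network connection, and the variable names are illustrative.

```r
# A few fields copied from the result shown above
movie <- list(
  Title      = "Rocky",
  Year       = "1976",
  Runtime    = "120 min",
  imdbRating = "8.1",
  imdbVotes  = "387,927",
  Metascore  = "N/A"
)

# Convert the strings before doing any arithmetic with them
year    <- as.integer(movie$Year)                       # 1976
rating  <- as.numeric(movie$imdbRating)                 # 8.1
votes   <- as.integer(gsub(",", "", movie$imdbVotes))   # 387927
runtime <- as.integer(sub(" min$", "", movie$Runtime))  # 120

# "N/A" is text, not a missing value; map it to NA explicitly
metascore <- if (movie$Metascore == "N/A") NA_integer_ else as.integer(movie$Metascore)

# A flat list of single values converts directly to a one-row data frame
movie_df <- as.data.frame(movie, stringsAsFactors = FALSE)
nrow(movie_df)   # 1
```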
A JSON array of objects
The output of this code snippet displays differently in the console, in the Notebook script (console), and in the Notebook HTML output. In the Notebook script output you can find the component name, in this case dollar-search: $Search. Alternatively, you can use bracket notation: [[1]]. Once you identify the component name, it’s easier to identify the element names.
jsonSeriesResutlsMatrix <- fromJSON("http://www.omdbapi.com/?s=rocky&type=series&r=json&page=1")
jsonSeriesResutlsMatrix
$Search
$totalResults
[1] "20"
$Response
[1] "True"
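When the printed output hides a component, as it does with $Search here, inspecting the structure directly can help. The sketch below uses a made-up stand-in for the search response: the top-level keys match the output above, but the titles and years are placeholders rather than real OMDb results.

```r
library(jsonlite)

# Made-up stand-in for a search response: same top-level keys as the
# output above, with placeholder rows instead of real results
search_json <- '{
  "Search": [
    { "Title": "Placeholder One", "Year": "2001", "Type": "series" },
    { "Title": "Placeholder Two", "Year": "2002", "Type": "series" }
  ],
  "totalResults": "20",
  "Response": "True"
}'
search_result <- fromJSON(search_json)

# Component names at the top level
names(search_result)          # "Search" "totalResults" "Response"

# The array of objects has been simplified to a data frame
class(search_result$Search)   # "data.frame"

# Name-based and position-based access return the same component
identical(search_result$Search, search_result[[1]])   # TRUE

# str() prints a compact view of the whole nested structure
str(search_result)
```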
Call the search results. jsonlite has already coerced the JSON array of objects into a data frame.
jsonSeriesResutlsMatrix$Search
jsonSeriesResutlsMatrix$Search$Title
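Once the search results are a data frame, the usual base R subsetting applies, and totalResults indicates how many further pages could be requested. The following sketch uses placeholder rows in place of the real results, and assumes that the API returns 10 results per page; that page size is an assumption, not something stated above.

```r
# Made-up stand-in for jsonSeriesResutlsMatrix$Search (placeholder rows)
search_df <- data.frame(
  Title = c("Placeholder One", "Placeholder Two", "Placeholder Three"),
  Year  = c("2001", "2002", "2010"),
  Type  = c("series", "series", "series"),
  stringsAsFactors = FALSE
)

# Pick columns, and keep only rows from 2002 onward
recent <- search_df[as.integer(search_df$Year) >= 2002, c("Title", "Year")]
recent$Title   # "Placeholder Two" "Placeholder Three"

# totalResults arrives as a string ("20" above); convert it before use
total_results <- as.integer("20")
per_page      <- 10   # assumption: 10 results per page
pages         <- ceiling(total_results / per_page)

# One request URL per page, following the pattern of the search URL above
page_urls <- paste0(
  "http://www.omdbapi.com/?s=rocky&type=series&r=json&page=",
  seq_len(pages)
)
page_urls
```

Each address in `page_urls` could then be passed to `fromJSON()` and the resulting $Search data frames combined with `rbind()`.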
Resources
- RStudio httr video
- jsonlite package
- List of images
- Movies of 1976